1. INTRODUCTION

In the last decade, the amount of data generated on a daily basis has grown exponentially.
Numerous efforts have been undertaken to provide industry and research with technologies that
facilitate the collection, processing, and sharing of massive data. Using these technologies poses
unprecedented challenges, especially in the banking sector. The central bank governors of the G10 and
Switzerland created the Basel Committee on Banking Supervision (BCBS) to assist them in monitoring and
exchanging data on the basis of common approaches and standards. BCBS reports that one of the reasons for
the financial crisis of 2008 was associated with problems in data architecture and information technology
infrastructure. Moreover, an inadequate or non-existent common vocabulary has created a gap in common
semantics. As a consequence, in many banking systems, assertions or decisions that depend on the data
cannot be considered reliable.

In his talk at the 2010 Gov 2.0 Expo in Washington, D.C., Tim Berners-Lee suggested awarding stars
for data sharing [1]. Under this scheme, governments earn three stars for sharing data on the web in a
non-proprietary format, four stars for publishing it as linked data, and a full five stars for connecting
their data to other data. Following the recommendations of BCBS and Tim Berners-Lee, an ontology-based
approach is proposed to assist different financial institutions in standardizing, monitoring, and exchanging
data in the credit risk field.

Ontologies have a lifecycle: like any artifact, they are created, implemented, evaluated, fixed,
exploited, and reused. Numerous methodologies and methods for developing and modeling ontologies have been
proposed in recent years; yet none of them constitutes a standard reference [2]. In this paper, we use


Journal homepage: http://ijeecs.iaescore.com 


ISSN: 2502-4752


ontology design patterns (ODPs) to model a credit risk scorecard as a worked example. ODPs are reusable
solutions to recurrent ontology modeling problems, which allows our ontology to be reused and expanded.
This idea was first proposed in [3, 4], with the aim of clearly designing ontologies used as the basis for
building other ontologies and of foreseeing the effect of changes or extensions to them. It is rarely the
case that an ontology engineer, given the set of requirements for the engineering task at hand, will fully
agree with all the ontological commitments made in a large existing ontology. However, not reusing any
well-established practices at all, and not aligning the ontology at least partly with existing ontologies,
will create problems for the interoperability, and potentially also the understandability, of the ontology.
Hence, there is a trade-off between interoperability, on the one hand, and over-commitment and conflicting
requirements, on the other. ODPs, as small “building blocks”, offer one way to manage this trade-off. They
carry the promise of better integration and interoperability of data across various domains [5]. There are
several types of ODPs, and they can be reused and applied in many different ways. The authors of [6]
identified them and grouped them into the six families shown in Figure 1: Structural OPs, Correspondence
OPs, Content OPs (CPs), Reasoning OPs, Presentation OPs, and Lexico-Syntactic OPs.

A Content OP (CP) can be considered roughly analogous to a software design pattern, with the
added benefit that it includes a reference base implementation (in the form of an OWL building block) ready
for immediate customization [7].

The authors of [6] noted that each CP is associated with a catalogue entry comprising the following
information fields: Name provides a name for the pattern; Intent describes the generic use case addressed by
the pattern; Competency Questions (CQs) contains examples of CQs that the knowledge base associated with the
CP needs to address; Diagram depicts a UML class diagram representing the pattern; Elements describes the
classes and relations included in the pattern. A CP is also described by the fields Scenarios, Consequences,
Known Uses, Extracted from/Reengineered from, and Related patterns. To our knowledge, there are two ways
to model with ODPs: the first approach is the eXtreme Design Methodology [8], and the second, which we
follow here, is mostly inspired by [9] and [10]. The latter consists of:

1) Formulating use case(s) for which the ODP is intended. 

2) Modeling the diagram and the appropriate logical axiomatization derived from the use case descriptions.

3) Verifying the resulting ODP to ensure its quality by checking the set of its axioms, computing some of 
its logical consequences, and populating it with sample data, which sometimes exposes shortcomings. 

In this work, we have chosen to use CPs as defined in the NeOn Project to model our ontologies, as
this is the most common type of ODP, with more than 100 patterns published [11].

The remainder of this paper is organized as follows. Section 2 reviews related work. Section 3
discusses the credit risk scorecard concepts. Section 4 shows the process of creating an ontology pattern
for the credit risk scorecard used in credit monitoring. Section 5 evaluates our ontology model, and the last
section concludes and discusses future work.
Figure 1. Ontology design pattern families


2. RELATED WORK 

In recent years, ontologies have been increasingly used in industry and research for different
purposes. The development of a semantic domain ontology can help reduce common problems and
ambiguities linked to tag-based systems [12]. Ontologies representing domain knowledge have been used
to guide the design of applications and to supply systems with semantic technology capabilities [13-16].
A general ontology that models the credit risk management process, together with two specific ontologies,
has been proposed in [17]. One of the specific ontologies models the process of credit allocation to
clients, while the other captures the concepts necessary for monitoring a credit system. Ontologies are
also beneficial when used in decision support systems [18]. The latter work proposes an integrated
ontological model for evaluating client applications that incorporates both the default risk of the
investment and the development component of the investment. The work in [19] proposes an approach for the
conceptualization and definition of the business rules present in the governance policies of the Brazilian
financial system, more specifically those related to risk management. It proposes an ontology called
Onto-Bacen that expresses the concepts of this domain and their relationships and that, by using inference
algorithms, can verify the compliance of hypothetical financial institutions with those policies.

Decision making is one of the main research themes of systems science, and decision support
systems (DSS) have been developed in many areas, e.g., management decision making and group decision
making. The authors of [20] developed an ontology-based decision support system to assess risk factors and
provide appropriate treatment suggestions for diabetic patients. The work in [21] presents an
ontology-driven method for multi-criteria decision making that explicitly focuses on ensuring that the
consequences of each choice are considered. In [22], the authors combine mobile technology, a fuzzy
ontology, and group decision-making algorithms to facilitate the mobilization of knowledge, giving users
decision-making support from dynamic and massive data through their mobile services.

The work in [23] attempts to present an ontological analysis of risk in which the authors integrate
three different perspectives: (i) risk as a quantitative notion, (ii) risk as a chain of events that impacts
an agent’s goals, and (iii) risk as the relationship of ascribing risk. Despite this effort, the proposal
remains superficial and simplified. Moreover, applying this ontology to risk analysis in a specific
domain, such as credit risk, remains a tedious task that may not succeed, given that the life cycle of the
ontology’s development is not specified and the ontology is not validated.

However, the above-mentioned solutions are not conclusive. Among their major drawbacks, we can
cite the following: they require a lot of manpower and time to integrate manually, and they are designed
and modeled for specific contexts, with strong commitments and a level of detail that make them difficult
to reuse and extend. To address some of these limitations, we developed a flexible conceptual architecture
based on ontology design patterns.


3. CREDIT RISK SCORECARD 
3.1. Requirements and Use Case 

The credit-granting process leads to two choices: granting a loan to a new customer or declining the
application. The purpose of the credit risk scorecard is defined as “the assignment quantitative measure to a
potential borrower to provide an estimation of its capacity to repay a loan” [24]. Logistic regression is
usually the most widely used technique for building a credit risk scorecard. More recently, artificial
intelligence techniques such as expert systems and neural networks have been used [25]. All of them involve
establishing and quantifying the relationship between the characteristics and good/bad performance (the
target).

The use case that drives our modeling allows the credit analyst to calculate the number of points of
each applicant (Table 1(a)). It also provides the analyst with a decision support tool (Table 1(b)) that
makes it possible to immediately reach an opinion on the credit allocation.

The modeling has to be as general as possible so that information can be added from any source
(web, credit bureau data, financial ratios, social networks, etc.). The schema has to be robust,
extendable, and easy to maintain and update. To achieve these objectives, the credit risk scorecard and
the decision support tool are modeled with ontology design patterns.

The ontology is built to address a set of requirements; in fact, it is evaluated against its 
corresponding requirements specification. These requirements can be defined through appropriate CQs, 
which define the scope of ontology. Some typical CQs are listed below: 

1) Which variables take part in the credit risk scorecard? 

2) Which credit risk scorecard does this variable participate in? 

3) What is the value of a given category?

4) What is the score band of low, medium and high scores? 

5) What does good and bad client mean? 

6) How many counts of a client type are bad? 

7) How many counts of a client type are good? 

8) What is the risk of an applicant who is under 25, applying for credit for the first time at the
institution, with no other credit, no non-payments, an account with a slightly positive balance (less
than €200), a small amount of savings (less than €500), and no guarantor, applying for credit for
36 months?

9) What is the risk of an applicant aged over 25, with credits in competing institutions, no
non-payments, an account with an average balance of more than €200, more than €500 in savings, and no
guarantor, applying for credit for 12 months?


Table 1(a). Example of credit scoring scorecard

Variable          ClassVal0                      Nbpoints
Age               > 25 years                     0
                  <= 25 years                    8
Other_credits     No credit                      0
                  Other banks or institution     7
Accounts          No checking account            0
                  CA >= 200 euros                13
                  CA [0-200 euros]               19
                  CA < 0 euros                   25
Credit_duration   <= 15 months                   0
                  16-36 months                   13
                  > 36 months                    18
Savings           No savings or > 500 euros      0
                  < 500 euros                    8
Guarantees        Guarantor                      0
                  No Guarantor                   21
Credit_history    No credit at any time          0
                  Credits without delay          6
                  Credits with non-payments      13


Table 1(b). Example of credit scoring decision support tool

Nbpoints by Credit (each cell lists Frequency, Percent, Row Pct)

Nbpoints        Good      Bad       Total
Low risk        389       37        426
                38.90     3.70      42.60
                91.31     8.69
Medium risk     239       137       376
                23.90     13.70     37.60
                63.56     36.44
High risk       72        126       198
                7.20      12.60     19.80
                36.36     63.64
Total           700       300       1000
                70.00     30.00     100.00

3.2. Data Sources 

Once we have a set of CQs as listed previously, we take a closer look at the data sources and the
data structure. The data set used consists of the credit risk scorecard (Table 1(a)) and the credit scoring
decision support tool (Table 1(b)) [26], which are the result of credit scoring modeling by logistic
regression applied to the ‘German credit data’. The latter contains 1000 consumer credit files, of which
700 are ‘Good’ applicants (no non-payments) and 300 are ‘Bad’ applicants (non-payments), and 19 independent
variables.

The credit risk scorecard contains the selected variables, their division into categories (ClassVal0),
and the weight per category for each variable (Nbpoints). The decision support tool is divided into three
score bands (low risk, medium risk, high risk); each band contains the number and the percentage of good and
bad applicants.
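The percentages reported per band can be recomputed from the frequencies alone. The following is a minimal Python sketch of that arithmetic, assuming only the Good/Bad counts of Table 1(b); the names `BAND_COUNTS` and `band_statistics` are illustrative and are not part of the original tool.

```python
# Frequencies (good, bad) per score band, copied from Table 1(b).
BAND_COUNTS = {
    "Low risk": (389, 37),
    "Medium risk": (239, 137),
    "High risk": (72, 126),
}


def band_statistics(counts):
    """Return the total, overall percentages and row percentages per band."""
    grand_total = sum(good + bad for good, bad in counts.values())
    stats = {}
    for band, (good, bad) in counts.items():
        total = good + bad
        stats[band] = {
            "total": total,
            # "Percent": share of all applicants in the data set.
            "percent_good": round(100 * good / grand_total, 2),
            "percent_bad": round(100 * bad / grand_total, 2),
            # "Row Pct": share of applicants within the band.
            "row_pct_good": round(100 * good / total, 2),
            "row_pct_bad": round(100 * bad / total, 2),
        }
    return stats


if __name__ == "__main__":
    for band, values in band_statistics(BAND_COUNTS).items():
        print(band, values)
```

The row percentage of bad applicants is the quantity an analyst would read as the observed non-payment rate of a band.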

The credit risk scorecard makes it possible to answer some CQs, e.g., questions (1) and (2).
However, the credit risk scorecard alone does not make it possible to address CQs (5), (6), (7), and (8),
which require the decision support tool, or CQ (4), which would require more fine-grained information
about the objectives of the project for which the credit risk scorecard is developed.
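To make the link between CQs and the scorecard concrete, the sketch below answers CQs (1), (2), and (3) against an excerpt of Table 1(a) held in a plain Python dictionary. This is an illustration only: the identifier `credit_risk_scorecard` and the function names are assumptions, and the ontology itself is not queried this way.

```python
# Excerpt of Table 1(a): scorecard -> variable -> category -> points.
# The scorecard identifier is a hypothetical name chosen for this sketch.
SCORECARDS = {
    "credit_risk_scorecard": {
        "Age": {"> 25 years": 0, "<= 25 years": 8},
        "Credit_duration": {"<= 15 months": 0, "16-36 months": 13, "> 36 months": 18},
        "Guarantees": {"Guarantor": 0, "No Guarantor": 21},
    }
}


def variables_of(scorecard):
    """CQ 1: which variables take part in the credit risk scorecard?"""
    return sorted(SCORECARDS[scorecard])


def scorecards_with(variable):
    """CQ 2: which credit risk scorecard does this variable participate in?"""
    return [name for name, variables in SCORECARDS.items() if variable in variables]


def points_for(scorecard, variable, category):
    """CQ 3: what is the value (number of points) of a given category?"""
    return SCORECARDS[scorecard][variable][category]


if __name__ == "__main__":
    print(variables_of("credit_risk_scorecard"))
    print(scorecards_with("Age"))
    print(points_for("credit_risk_scorecard", "Guarantees", "No Guarantor"))
```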

For additional information, possible sources are plentiful, mainly when the applicant’s file needs
close analysis and examination, e.g., credit bureau data, social networks, and demographic data. For the
moment, attention is given to modeling the existing sources: the credit risk scorecard and the decision
support tool. At the same time, our modeling should be as expandable as possible so that other data sources
can be included.


4. CREDIT SCORECARD MODELING 
4.1. Ontology Modeling 

We have a working group with a mix of participants: a domain expert, participants specializing in the
databases addressed by the modeling, and an ontology engineer working on the ODP-based modeling process.
Reconciling the domain expert’s differing perceptions of the topic (the credit risk scorecard and the
decision support tool, as data providers) with the CQs cited above leads to an expandable scope. This in
turn requires data integration, which is part of our use case. The modeling of our ontology makes data
publication simpler and graph structures intuitive, and makes reuse easier [10].

Based on the credit risk scorecard and decision support tool formats discussed earlier, and based on
our CQs, we can identify several key notions that need to be modeled: agent, event, variables,
categories, decision support tool, credit risk scorecard support tool, and credit risk scorecard.
Let us start with the notion of the agent. Borrowing from best practices, we recognize that
“being an agent” is actually a role that an agent can take (e.g., Developer, Analyst, or Manager) for a
certain period of time. We therefore reuse the common ontology design pattern for this purpose, which is
depicted in Figure 2 in exactly the form in which it is used, e.g., in [27]. We opted for adapting this pattern
to our specific case, leaving most things untouched while avoiding overgeneralization. The resulting
pattern is depicted in Figure 3. Note that we have introduced three different agent roles: Analyst, Developer,
and Manager. The yellow frames indicate that a more complex entity (a pattern in its own right) could stand
in place of the frame. Such an entity could be modeled with a more fine-grained model or, alternatively, an
existing ODP could be used directly instead.
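The idea that a role is held by an agent for a bounded period can be sketched in a few lines of Python. This is a loose illustration under stated assumptions, not the OWL pattern itself: the field `ends_at` mirrors the `endsAtTime` label visible in the pattern diagram, while `starts_at`, `Agent`, `AgentRole`, and `is_active` are hypothetical names introduced here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# The three agent roles introduced in the adapted pattern.
ROLE_TYPES = ("Analyst", "Developer", "Manager")


@dataclass(frozen=True)
class Agent:
    name: str


@dataclass(frozen=True)
class AgentRole:
    """A role is a separate entity that an agent performs for a period of time."""
    role_type: str
    performed_by: Agent
    starts_at: date                  # hypothetical counterpart of endsAtTime
    ends_at: Optional[date] = None   # None means the role is still held

    def __post_init__(self):
        if self.role_type not in ROLE_TYPES:
            raise ValueError(f"unknown role type: {self.role_type}")

    def is_active(self, day: date) -> bool:
        """Return True if the role is held on the given day."""
        return self.starts_at <= day and (self.ends_at is None or day <= self.ends_at)


if __name__ == "__main__":
    person = Agent("Example Person")  # hypothetical agent
    role = AgentRole("Analyst", person, date(2020, 1, 1), date(2020, 12, 31))
    print(role.is_active(date(2020, 6, 1)))
```

Keeping the role separate from the agent means the same person can be an Analyst for one scorecard and a Manager for another without any change to the `Agent` record.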
[Diagram omitted; legible label: endsAtTime]
Figure 2. The agent role Content ODP’s UML graphical representation

Figure 3. Adapted agent role ODP for credit risk scorecard player roles, UML graphical representation

Let us turn our attention to the credit risk scorecard. From the viewpoint of the team’s domain expert,
the editing and validation of a credit risk scorecard is an event or, alternatively, a step of credit risk
scorecard development (Figure 4). We therefore suggest reusing a generic event pattern, such as the one
depicted in Figure 5, in exactly the form in which it is used, e.g., in [27].


Figure 4. Credit risk scorecard development steps





[Diagram omitted; legible labels: atTime, providesAgentRole, performedBy]
Figure 5. The event Content ODP’s UML graphical representation
As stated earlier, we specialize the pattern for our specific purpose; the result is depicted in
Figure 6. Note that we linked the place directly to a string containing the name of the place in which the
credit risk scorecard was developed; development can be in-house or carried out by external vendors. In
contrast, if we had directly linked the credit risk scorecard to a string containing the name of the
modeling algorithm, we would not be able to provide an extension in the future. At the same time, we reuse
the agent role pattern. The sub-events (Figure 4) and temporal information are not treated at this stage,
yet we intend to make this extension possible later without having to change anything already modeled.
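The difference between linking to a string and linking to an entity can be illustrated with a toy set of triples in Python. This sketch rests on assumptions: `developedIn` and `hasName` are labels legible in the Figure 6 diagram, whereas `usesAlgorithm`, `hasFamily`, the `ex:` identifiers, and the literal values are hypothetical and serve only to show the design choice.

```python
from typing import NamedTuple


class Literal(NamedTuple):
    """A plain string value; nothing further can be said about it."""
    value: str


# Base model: the place is a literal, the algorithm is a node of its own.
BASE = {
    ("ex:scorecard1", "developedIn", Literal("in-house")),
    ("ex:scorecard1", "usesAlgorithm", "ex:algorithm1"),   # hypothetical property
    ("ex:algorithm1", "hasName", Literal("logistic regression")),
}

# Later extension: new facts attach to the algorithm node only.
EXTENSION = {
    ("ex:algorithm1", "hasFamily", Literal("generalized linear model")),  # hypothetical
}


def can_be_extended(obj):
    """Only nodes, not literals, can become the subject of further triples."""
    return not isinstance(obj, Literal)


def properties_of(graph, subject):
    """Collect the outgoing properties of one subject."""
    return {p: o for s, p, o in graph if s == subject}


if __name__ == "__main__":
    extended = BASE | EXTENSION
    print(BASE <= extended)  # existing triples are untouched by the extension
    print(properties_of(extended, "ex:algorithm1"))
```

Because the algorithm is a node, the extension adds triples without rewriting any existing one, whereas the place, modeled as a literal, would have to be remodeled before anything more could be said about it.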


[Diagram omitted; legible labels: atTime, developedIn, hasName]
Figure 6. The credit risk scorecard as event


The next part we deal with is the variables; since we have settled for viewing the credit risk 
scorecard as event, the variables are naturally participants in the event. Therefore, we suggest reusing a 
generic participation pattern such as the one depicted in Figure 7(a) in exactly the form in which it is used, 
e.g., in. Following the same approach, we adapt this pattern to fit our specific case, so we prefer to leave most 
of the things untouched. The resulting pattern is depicted in Figure 7(b). 
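The participation pattern relates an event and its participants in both directions. As a rough sketch, assuming that `hasParticipant` and `isParticipantIn` (the two labels legible in the Figure 7 diagram) are inverses of each other, the following Python code derives the inverse triples from asserted ones; the `ex:` identifiers and function names are hypothetical, and a real OWL reasoner would perform this inference from an inverse-property axiom.

```python
# Assumed inverse relationship between the two participation properties.
INVERSE_OF = {
    "hasParticipant": "isParticipantIn",
    "isParticipantIn": "hasParticipant",
}


def with_inverses(triples):
    """Return the triples plus the inverse of every participation triple."""
    closed = set(triples)
    for subject, predicate, obj in triples:
        if predicate in INVERSE_OF:
            closed.add((obj, INVERSE_OF[predicate], subject))
    return closed


def scorecards_of(variable, graph):
    """Which scorecard events does a variable participate in?"""
    return sorted(o for s, p, o in graph if s == variable and p == "isParticipantIn")


if __name__ == "__main__":
    asserted = {
        ("ex:scorecard1", "hasParticipant", "ex:Age"),
        ("ex:scorecard1", "hasParticipant", "ex:Savings"),
    }
    graph = with_inverses(asserted)
    print(scorecards_of("ex:Age", graph))
```

Only one direction has to be asserted when data are published; the other direction is what lets a question phrased from the variable's side be answered.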

Considered from the viewpoint of our team’s domain expert, the categories are classifications of 
variables. Therefore, we suggest reusing a generic classification pattern such as the one depicted in Figure 
8(a) in exactly the form in which it is used, e.g., in. We intend to adapt this pattern for our specific case. The 
resulting pattern is depicted in Figure 8(b). 


[Diagram omitted; legible labels: hasParticipant, isParticipantIn]
Figure 7(a). The participation Content ODP’s UML graphical representation
Figure 7(b). The variables as participation

Figure 8(a). The classification Content ODP’s UML graphical representation
Figure 8(b). The categories as classification


Moving to the credit scoring decision support tool, the numbers of points are divided into bands.
The risk bands (low, medium, high) are therefore classifications of the credit scoring decision support
tool. Here, as in the previous case, we suggest reusing the generic classification pattern depicted in
Figure 8(a). The resulting pattern is depicted in Figure 9. It is worth noting that we have introduced three
different risk bands, two different clients (Good and Bad), and the credit risk scorecard objectives. As
mentioned above, and with future extension in mind, we do not directly link the client and the credit risk
scorecard objectives to strings containing their definitions.
Figure 9. The credit risk scoring decision support tool as a pattern


4.2. Relating Things Together 

Finally, after the development of each piece of the whole credit risk scorecard, we intend to 
assemble all the pieces, producing the result depicted in Figure 10. Note that we directly linked credit risk 
scorecard to credit risk scorecard objectives and credit scoring decision support tool. 


4.3. Validation of the Ontology

Apart from the evaluation undertaken by the domain experts, our ontology must be validated in
order to serve as a reference and reusable ontology in the credit risk field. Since no other credit risk
ontology was available for a feasible comparison, we decided to use an approach from the ontology
evaluation field. According to [28], any approach to ontology evaluation has to take into consideration
several criteria: accuracy, completeness, conciseness, adaptability, clarity, computational efficiency,
and consistency.

Many approaches to ontology evaluation are discussed in [29]. We opted for OOPS!, an automatic online
evaluation framework for OWL ontologies that maintains a catalogue of common pitfalls [29] and respects
most of the ontology evaluation criteria of [28]. The online framework detected no errors for any of the
indicators it supports. Since we used the Protégé ontology editor, we also used the HermiT 1.3.8.413
reasoner [30] bundled with it to test the consistency of the ontology. The result revealed no
contradictions in our ontology.
Figure 10. The credit risk scorecard model


5. CREDIT RISK ASSESSMENT

The system consists of two main ontologies: the credit risk scorecard ontology and the applicant
profile ontology. They are implemented in the Web Ontology Language (OWL) with the Protégé tool [31].
The proposed reasoning algorithm for the applicant analysis is illustrated in Figure 11.


[Diagram omitted; legible elements: financial characteristics of the applicant; credit risk scorecard
ontology; {number of points for each characteristic, sum of points obtained}]
Figure 11. Reasoning algorithm


5.1. Credit Risk Scorecard Ontology

This ontology, shown in Figure 12, is the core of the system. It contains information about the credit
risk scorecard model as in Figure 10.

Figure 12. Credit risk scorecard ontology

5.2. Applicant Profile Ontology

The applicant profile ontology, shown in Figure 13, is an OWL file that encapsulates the applicant
details as entered by the credit risk scorecard analyst. This file is generated as soon as the applicant
submits the loan application. The concept applicant profile has 20 properties; these attributes are
presented in Table 2.

Figure 13. Applicant ontology


Table 2. The applicant profile: list of attributes

Type of attribute                 Attribute                                                  Type
Credit information                Accounts                                                   Qualitative
                                  Credit duration                                            Numerical
                                  Credit history                                             Qualitative
                                  Credit purpose                                             Qualitative
                                  Credit amount                                              Numerical
Applicant personal profile        Personal status                                            Qualitative
                                  Age in years                                               Numerical
                                  Telephone                                                  Qualitative
Applicant financial information   Savings                                                    Qualitative
                                  Present employment                                         Qualitative
                                  Installment rate in percentage of disposable income        Numerical
                                  Guarantees                                                 Qualitative
                                  Present residence since                                    Numerical
                                  Property                                                   Qualitative
                                  Other installment plans                                    Qualitative
                                  Housing                                                    Qualitative
                                  Other credits                                              Numerical
                                  Job                                                        Qualitative
                                  Number of people being liable to provide maintenance for   Numerical
                                  Foreign worker                                             Qualitative


5.3. Ontology Testing and Result 

The first step is the analysis of the applicant’s financial characteristics. The system assigns the
appropriate number of points (weight) to each characteristic (Age, Other_credits, Accounts,
Credit_duration, Savings, Guarantees, Credit_history) based on the credit risk scorecard ontology. By way
of example, we calculate the number of points for each characteristic of the two applicants described in
CQs 8 and 9 above. For instance, the applicant variable Age is divided into two categories: > 25 years and
<= 25 years. Every characteristic has a calculated number of points. Table 3 shows the weights and the sum
of points obtained for the two applicants’ characteristics.


Table 3. Sample of points for applicants’ characteristics

Applicant   Age   Other_credits   Accounts   Credit_duration   Savings   Guarantees   Credit_history   Sum
1 (CQ 8)    8     0               19         13                8         21           0                69
2 (CQ 9)    0     7               13         0                 0         21           0                41

The second step deals with decision making by comparing two parameters: the sum of points
obtained and the Nbpoints bands in the credit risk scorecard ontology. With 69 points, the first applicant
falls in the medium risk band, while the second applicant, with 41 points, falls in the low risk band.
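The two steps can be sketched in Python as follows. The point values come from Table 1(a), and the category chosen for each applicant is the one that reproduces the corresponding row of Table 3. The band cut-offs are not stated in the source, so `HYPOTHETICAL_LOW_MAX` and `HYPOTHETICAL_MEDIUM_MAX` are placeholder assumptions chosen only to be consistent with the two reported outcomes; the function names are likewise illustrative.

```python
# Points per category, from Table 1(a).
SCORECARD = {
    "Age": {"> 25 years": 0, "<= 25 years": 8},
    "Other_credits": {"No credit": 0, "Other banks or institution": 7},
    "Accounts": {"No checking account": 0, "CA >= 200 euros": 13,
                 "CA [0-200 euros]": 19, "CA < 0 euros": 25},
    "Credit_duration": {"<= 15 months": 0, "16-36 months": 13, "> 36 months": 18},
    "Savings": {"No savings or > 500 euros": 0, "< 500 euros": 8},
    "Guarantees": {"Guarantor": 0, "No Guarantor": 21},
    "Credit_history": {"No credit at any time": 0, "Credits without delay": 6,
                       "Credits with non-payments": 13},
}

# ASSUMPTION: the real band limits are not given; these are placeholders.
HYPOTHETICAL_LOW_MAX = 50
HYPOTHETICAL_MEDIUM_MAX = 75


def score(applicant):
    """Step 1: points per characteristic and their sum."""
    points = {var: SCORECARD[var][category] for var, category in applicant.items()}
    return points, sum(points.values())


def risk_band(total, low_max=HYPOTHETICAL_LOW_MAX, medium_max=HYPOTHETICAL_MEDIUM_MAX):
    """Step 2: compare the sum of points with the band limits."""
    if total <= low_max:
        return "Low risk"
    if total <= medium_max:
        return "Medium risk"
    return "High risk"


# Categories selected to match the rows of Table 3.
APPLICANT_1 = {"Age": "<= 25 years", "Other_credits": "No credit",
               "Accounts": "CA [0-200 euros]", "Credit_duration": "16-36 months",
               "Savings": "< 500 euros", "Guarantees": "No Guarantor",
               "Credit_history": "No credit at any time"}
APPLICANT_2 = {"Age": "> 25 years", "Other_credits": "Other banks or institution",
               "Accounts": "CA >= 200 euros", "Credit_duration": "<= 15 months",
               "Savings": "No savings or > 500 euros", "Guarantees": "No Guarantor",
               "Credit_history": "No credit at any time"}

if __name__ == "__main__":
    for applicant in (APPLICANT_1, APPLICANT_2):
        points, total = score(applicant)
        print(points, total, risk_band(total))
```

With the real band limits substituted for the placeholders, the same comparison yields the decision that the analyst reads from the decision support tool.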

The main differentiating component of this approach is the use of an ontology for representing the
domain knowledge. This ontology has been used to guide the design of the solution and to supply the system
with semantic capabilities, which makes credit data meaningful. The whole approach helps the entire
financial institution, from analyst to manager, to understand both the business opportunities and the
power of what can be gleaned from the data.


6. CONCLUSION AND FUTURE WORK 

In this paper, we used ontology design patterns (ODPs) to model a credit risk scorecard in order to
improve credit risk management. Our model directly links the credit risk scorecard to the credit risk
scorecard objectives and to the credit risk scoring decision support tool. The ontology is then used to
identify the corresponding tasks, subtasks, roles, actors, and resources of the affected business. Our
proposal is expandable and aims at satisfying various needs: it provides modular, reusable, replaceable
pieces, makes data publication simpler, and makes graph structures intuitive. In the meantime, the ontology
that we developed has been made available on the Web as an OWL file, so that specific financial
institutions (e.g., credit bureaus) may use and augment it, and we are currently working on submitting it
to the ontology design patterns (ODPs) portal. To enrich our proposal in future work, we will define the
ontology’s classes (e.g., credit risk scorecard) via axiomatization. The axiomatic method constitutes a
common framework for the discussion of scientific problems among people coming from different backgrounds,
and even among people working in different but strongly related branches of the same discipline (e.g.,
credit risk scorecard analyst, manager, and developer). Finally, we suggest that the proposal presented in
this paper is a stepping stone towards a more complex and sophisticated credit risk ontology. For instance,
extensions may consist of the development of a credit risk scorecard ontology covering the steps shown in
Figure 4 and of a credit product recommendation ontology.